[SPARK-57778][SQL] Handle Oracle JDBC objects that are not selectable as tables#56898
[SPARK-57778][SQL] Handle Oracle JDBC objects that are not selectable as tables#56898ivan-leventsov wants to merge 2 commits into
Conversation
30f2dac to
54139ed
Compare
MaxGekk
left a comment
There was a problem hiding this comment.
0 blocking, 1 non-blocking, 0 nits.
Clean Oracle error-classification extension that faithfully mirrors the existing isObjectNotFoundException predicate pattern (dialect default + Oracle override + tableExists/resolveTable wiring), kept deliberately separate so "not found" and "not selectable" stay distinct conditions.
Suggestions (1)
JdbcDialects.scala:811:@Since("4.2.0")looks stale — currentbranch-4.xis4.3.0, so a method added now first ships there — see inline.
Verification
- Predicate disjointness (the key correctness point):
OracleDialectoverridesisObjectNotFoundExceptionto match onlyORA-00942/ORA-39165, so it does NOT interceptORA-04044/ORA-04063. The newisNotSelectableObjectExceptionbranch inresolveTableis therefore reachable (not dead), and it precedes theisSyntaxErrorBestEffort(42000) case so a 42000 SQLState can't divert it.tableExistsreturnsfalsefor these objects. - Default
falseleaves non-Oracle dialects unaffected; the error condition is alphabetically placed with a matching<objectName>param and chained cause. Integration tests cover ORA-04044 (procedure + broken-function synonyms) and ORA-04063 (invalid view), asserting bothtableExists=falseand the dedicated error.
| Option(e.getSQLState).exists(_.startsWith("42")) | ||
| } | ||
|
|
||
| @Since("4.2.0") |
There was a problem hiding this comment.
dev/next_version_candidates.py reports the current branch-4.x at 4.3.0 (master 5.0.0), so a method added now first ships in 4.3.0 — 4.2.0 looks like it predates the recent version bump (the existing @Since("4.2.0") entries in this file are from the prior cycle).
| @Since("4.2.0") | |
| @Since("4.3.0") |
… as tables
### What changes were proposed in this pull request?
When using the built-in JDBC `TableCatalog` over Oracle, the driver's table listing
returns objects that cannot be read as tables, e.g. a synonym that resolves to a PL/SQL
procedure/function/package, or an invalid view. Probing such an object (`tableExists` via
`SELECT 1 FROM <obj> WHERE 1=0`, or schema resolution via `SELECT * FROM <obj> WHERE 1=0`)
raises `ORA-04044` ("procedure, function, package, or type is not allowed here") or
`ORA-04063` ("... has errors"). These were unclassified, so `tableExists` threw and table
resolution surfaced a raw `FAILED_JDBC` failure.
This adds a `JdbcDialect.isNotSelectableObjectException` predicate (Oracle recognizes
`ORA-04044`/`ORA-04063`) so:
- `JdbcUtils.tableExists` returns `false` for such objects instead of throwing;
- `JDBCRDD.resolveTable` throws a dedicated, clear error (`JDBC_OBJECT_NOT_SELECTABLE`)
instead of a generic failure.
### Why are the changes needed?
A schema legitimately contains synonyms/views that are not selectable tables; probing them
should not surface a raw external-engine error or make existence checks throw.
### Does this PR introduce any user-facing change?
Yes. Querying such an object now returns `JDBC_OBJECT_NOT_SELECTABLE` instead of a raw
`FAILED_JDBC` error, and `tableExists` returns `false` for it.
### How was this patch tested?
New cases in `OracleIntegrationSuite` (docker-integration-tests): synonym to a procedure,
synonym to a broken function, and an invalid view.
54139ed to
be95c45
Compare
cloud-fan
left a comment
There was a problem hiding this comment.
1 blocking, 2 non-blocking.
Clean change that mirrors the existing isObjectNotFoundException classification pattern; the design (new predicate + dedicated JDBC_OBJECT_NOT_SELECTABLE condition) is the right call, and the new error propagates through loadTable's FAILED_JDBC.LOAD_TABLE wrapper without being masked. One blocking version-stamp fix.
Correctness (1)
- JdbcDialects.scala:811:
@Since("4.2.0")should be@Since("4.3.0")— see inline
Suggestions (2)
- JdbcDialects.scala:812: add Scaladoc documenting the new opt-in hook — see inline
- OracleDialect.scala:59: optional
getErrorCodevs message-substring note — see inline
Verification
Checked that JDBC_OBJECT_NOT_SELECTABLE is the correct outcome rather than reusing TABLE_OR_VIEW_NOT_FOUND: the object exists (synonym→procedure for ORA-04044, INVALID view for ORA-04063), so the not-found message ("cannot be found, verify the spelling") would mislead, and the NoSuchTableException type it carries is silently tolerated by IF-EXISTS logic. Also traced propagation: JdbcUtils.classifyException rethrows SparkThrowable unchanged (JdbcUtils.scala:1382), so the new AnalysisException survives loadTable's wrapper. Branch ordering in resolveTable is correct — the new predicate is checked before isSyntaxErrorBestEffort, which matches Oracle SQLSTATE "42000" that ORA-04044/04063 can also carry.
| } | ||
|
|
||
| @Since("4.3.0") | ||
| def isNotSelectableObjectException(e: SQLException): Boolean = false |
There was a problem hiding this comment.
This is a general opt-in hook (base default false, like isObjectNotFoundException), but only Oracle overrides it today. Consider a short Scaladoc stating the contract, so the Oracle-only impl reads as the first adopter and the next dialect author knows when to override. Placed above the @Since line, e.g.:
/**
* Returns true if the given exception indicates the object exists but cannot be read as a
* table or view (e.g. a synonym that resolves to a procedure, or an invalid view), as opposed
* to not existing at all (see `isObjectNotFoundException`). Dialects override this to recognize
* their own error codes; the default is false.
*/(Left as an illustrative block rather than a one-click suggestion so it doesn't collide with the @Since fix above.)
|
|
||
| override def isNotSelectableObjectException(e: SQLException): Boolean = { | ||
| // ORA-04044: object is not a table (e.g. a synonym to a procedure/function/package). | ||
| e.getMessage.contains("ORA-04044") || |
There was a problem hiding this comment.
Optional, non-blocking: matching on e.getMessage.contains(...) is fragile to message format/locale; e.getErrorCode == 4044 (and 4063) would be more robust. This mirrors the existing isObjectNotFoundException here, so it's consistent either way — only worth changing if you want to harden both together.
There was a problem hiding this comment.
Yeah, makes sense, but I'd leave it as is in this PR and keep it consistent with the existing implementation to minimize risk and avoid potential side effects from this small change (for example, the behavior might differ across driver versions etc..)
What changes were proposed in this pull request?
When using the built-in JDBC
TableCatalogover Oracle, the driver's table listing returns objects that cannot be read as tables, for example a synonym that resolves to a PL/SQL procedure/function/package, or an invalid view. Probing such an object (tableExistsviaSELECT 1 FROM <obj> WHERE 1=0, or schema resolution viaSELECT * FROM <obj> WHERE 1=0) raisesORA-04044("procedure, function, package, or type is not allowed here") orORA-04063("... has errors"). These are currently unclassified, sotableExiststhrows instead of returningfalse, and table resolution surfaces a rawFAILED_JDBCerror.This PR adds a new
JdbcDialect.isNotSelectableObjectExceptionpredicate (defaultfalse;OracleDialectrecognizesORA-04044/ORA-04063), kept separate fromisObjectNotFoundException("object does not exist"). With it:JdbcUtils.tableExistsreturnsfalsefor such objects instead of throwing;JDBCRDD.resolveTablethrows a dedicated, clear error conditionJDBC_OBJECT_NOT_SELECTABLEinstead of a generic failure.Why are the changes needed?
A schema legitimately contains synonyms/views that are not selectable as tables. Probing them should not surface a raw external-engine error or make existence checks throw; such objects should be reported as not a readable table.
Does this PR introduce any user-facing change?
Yes. For a JDBC catalog over Oracle:
JDBC_OBJECT_NOT_SELECTABLE(SQLSTATE42000) instead of a rawFAILED_JDBCerror;tableExistsreturnsfalsefor such an object instead of throwing.How was this patch tested?
New cases in
org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite(docker-integration-tests), covering a synonym to a procedure, a synonym to a broken function (bothORA-04044), and an invalid view (ORA-04063). Each asserts thattableExistsreturnsfalseand that reading the object fails withJDBC_OBJECT_NOT_SELECTABLE.Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored using Cursor (AI assistant).